Follicular: FL1018

Allen Zhang

October 29, 2018

Introduction

Data

We have data for 2 FL samples: FL1018T1 and FL1018T2. This patient has transformed FL.

Mitochondrial counts

Ribosomal counts

Filtering

We remove cells with more than 10% mitochondrial counts or more than 60% ribosomal counts. These thresholds were chosen from the distributions shown in the previous plots.
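As a rough illustration of the filtering rule, here is a minimal Python sketch. It is not the code used for this analysis; the field names (`pct_mito`, `pct_ribo`), the example cells, and the choice of inclusive thresholds are all assumptions made for illustration.

```python
# Hypothetical per-cell QC metrics; field names and values are illustrative only.
MITO_MAX_PCT = 10.0   # mitochondrial threshold from the text
RIBO_MAX_PCT = 60.0   # ribosomal threshold from the text

def passes_qc(pct_mito, pct_ribo):
    """Keep a cell only if both percentages are at or below their thresholds.

    Whether the original filter is inclusive (<=) or strict (<) is an assumption.
    """
    return pct_mito <= MITO_MAX_PCT and pct_ribo <= RIBO_MAX_PCT

cells = [
    {"barcode": "cell_a", "pct_mito": 3.2, "pct_ribo": 41.0},
    {"barcode": "cell_b", "pct_mito": 14.5, "pct_ribo": 35.0},  # fails the mitochondrial filter
    {"barcode": "cell_c", "pct_mito": 6.0, "pct_ribo": 72.3},   # fails the ribosomal filter
]

kept = [c["barcode"] for c in cells if passes_qc(c["pct_mito"], c["pct_ribo"])]
print(kept)
```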

Normalization

We normalize the data and then reduce dimensionality by:

  • Computing size factors with scran (after quickCluster)
  • Using scater::normalize
  • Running PCA, t-SNE, and UMAP for dimensionality reduction
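To make the idea of a size factor concrete, the sketch below uses plain library-size factors followed by a log transform. This is a deliberate simplification: it does not reproduce scran's pooling-based size factors or the exact behaviour of scater::normalize, and the function names and toy counts are hypothetical.

```python
import math

def library_size_factors(counts_per_cell):
    """Simplified size factors: each cell's total count divided by the mean total.

    Assumption: every cell has a nonzero total (i.e. filtering has already been applied).
    """
    totals = [sum(cell) for cell in counts_per_cell]
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]

def log_normalize(counts_per_cell, size_factors):
    """Divide each count by its cell's size factor, then take log2(x + 1)."""
    return [
        [math.log2(x / sf + 1.0) for x in cell]
        for cell, sf in zip(counts_per_cell, size_factors)
    ]

# Toy count matrix: 3 cells x 3 genes (values are made up).
counts = [[10, 0, 30], [5, 5, 10], [40, 20, 60]]
size_factors = library_size_factors(counts)
logcounts = log_normalize(counts, size_factors)
print(size_factors)
```

Size factors defined this way average to 1, so the normalized values stay on roughly the same scale as the raw counts.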

t-SNE

UMAP

Total number of UMIs (PCA)

Cell cycle

We use cyclone to assign cells to cell cycle states.

Cell cycle (PCA)

Cell cycle (t-SNE)

Cell cycle (UMAP)

CellAssign (conservative)

Marker gene matrix

Gene | B cells | T cells
CD19 | 1 | 0
CD28 | 0 | 1
CD3D | 0 | 1
CD3E | 0 | 1
CD3G | 0 | 1
CD74 | 1 | 0
CD79A | 1 | 0
CD79B | 1 | 0
IGKC | 1 | 0
IGLC2 | 1 | 0
IGLC3 | 1 | 0
MS4A1 | 1 | 0
TRAC | 0 | 1
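The marker gene matrix above is a binary gene-by-celltype indicator. A minimal Python sketch of how such a matrix can be built from per-celltype marker lists follows; the helper name `marker_matrix` is hypothetical, and this is not the code used to produce the table.

```python
# Marker lists transcribed from the conservative marker gene matrix above.
markers = {
    "B cells": ["CD19", "CD74", "CD79A", "CD79B", "IGKC", "IGLC2", "IGLC3", "MS4A1"],
    "T cells": ["CD28", "CD3D", "CD3E", "CD3G", "TRAC"],
}

def marker_matrix(markers):
    """Return (celltypes, rows), where rows maps each gene to a 0/1 list per celltype."""
    celltypes = list(markers)
    genes = sorted({gene for gene_list in markers.values() for gene in gene_list})
    rows = {
        gene: [1 if gene in markers[ct] else 0 for ct in celltypes]
        for gene in genes
    }
    return celltypes, rows

celltypes, rows = marker_matrix(markers)
for gene, indicators in rows.items():
    print(gene, *indicators)
```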

Cell assignments (MAP, PCA)

Cell assignments (MAP, t-SNE)

Cell assignments (MAP)

B cells | T cells
3452 | 688

T-cell subclustering (phenograph, PCA)

T-cell subclustering (phenograph, UMAP)

CellAssign (full)

Marker gene matrix

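Gene | B cells | Cytotoxic T cells (activated) | Cytotoxic T cells | Naive/resting CD4 | CD4 (activated) | Tfh | other
CCL5 | 0 | 1 | 1 | 0 | 0 | 0 | 0
CCR7 | 1 | 0 | 0 | 1 | 0 | 0 | 0
CD19 | 1 | 0 | 0 | 0 | 0 | 0 | 0
CD2 | 0 | 1 | 1 | 1 | 1 | 1 | 0
CD3D | 0 | 1 | 1 | 1 | 1 | 1 | 0
CD3E | 0 | 1 | 1 | 1 | 1 | 1 | 0
CD3G | 0 | 1 | 1 | 1 | 1 | 1 | 0
CD4 | 0 | 0 | 0 | 1 | 1 | 1 | 0
CD69 | 0 | 1 | 0 | 0 | 1 | 1 | 0
CD74 | 1 | 0 | 0 | 0 | 0 | 0 | 0
CD79A | 1 | 0 | 0 | 0 | 0 | 0 | 0
CD79B | 1 | 0 | 0 | 0 | 0 | 0 | 0
CD8A | 0 | 1 | 1 | 0 | 0 | 0 | 0
CD8B | 0 | 1 | 1 | 0 | 0 | 0 | 0
CXCR5 | 1 | 0 | 0 | 0 | 0 | 1 | 0
EOMES | 0 | 1 | 1 | 0 | 0 | 0 | 0
GZMA | 0 | 1 | 1 | 0 | 0 | 0 | 0
ICA1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
ICOS | 0 | 0 | 0 | 0 | 0 | 1 | 0
IFNG | 0 | 1 | 0 | 0 | 0 | 0 | 0
IGKC | 1 | 0 | 0 | 0 | 0 | 0 | 0
IGLC2 | 1 | 0 | 0 | 0 | 0 | 0 | 0
IGLC3 | 1 | 0 | 0 | 0 | 0 | 0 | 0
IL7R | 0 | 0 | 0 | 1 | 1 | 0 | 0
MS4A1 | 1 | 0 | 0 | 0 | 0 | 0 | 0
NKG7 | 0 | 1 | 1 | 0 | 0 | 0 | 0
PDCD1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
SELL | 1 | 0 | 0 | 1 | 0 | 0 | 0
ST8SIA1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
TNFRSF4 | 0 | 0 | 0 | 0 | 0 | 1 | 0
TRAC | 0 | 1 | 1 | 1 | 1 | 1 | 0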

Cell assignments (MAP, UMAP)

Probabilities

T-cell unsupervised clusters (UMAP)

T-cell subtype assignments (UMAP)

  • Why is vimentin not expressed in the Tfh cells?
  • Why is CXCR4 expressed in the memory + Tc1 populations?

Malignant vs. nonmalignant B cell identification

B cells (UMAP)

B cell unsupervised clustering (phenograph, UMAP)

IGKC expression

IGLC2 expression

IGLC3 expression

Manual identification of malignant vs. nonmalignant clusters (UMAP)

B cell subclusters (malignant)

Introduction

We can look for subclusters of B cells, e.g. memory vs. naive, within the malignant population.

Malignant B cell unsupervised clustering (phenograph, UMAP)

IgG-expressing cells

Similar results for other IgG chains. No IgM, IgD, or IgA expression.

IgE-expressing cells

For reasons that are not yet clear, T1 cells express IgE.

Cluster 0: Non-IgE cluster

Cluster 1: Non-IgE cluster

Cluster 2: IgE cluster

Cluster 3: Non-IgE cluster

Cluster 4: Proliferating cluster

NOTE: This is predominantly present in T2 as opposed to T1. These cells correspond to the S phase cells observed in T2.

Cluster 5: DUSP1-expressing cluster

What’s going on in this cluster?

Cluster 6: Low ribo cluster

B cell subclusters (nonmalignant)

Introduction

We can look for subclusters of B cells, e.g. memory vs. naive, within the nonmalignant population.

Nonmalignant B cell unsupervised clustering (phenograph, UMAP)

We don’t really have enough cells here to do anything meaningful.

Celltype proportion changes between timepoints

Introduction

We’ll look to see how celltype proportions change between timepoints.

Overall proportions

Celltype | FL1018T1 (%) | FL1018T2 (%)
B cells | 7.65 | 0.44
B cells (malignant) | 56.98 | 92.34
CD4 (activated) | 4.91 | 1.80
Cytotoxic T cells | 4.21 | 0.70
Cytotoxic T cells (activated) | 2.32 | 2.95
Naive/resting CD4 | 8.77 | 0.33
other | 0.35 | 0.07
Tfh | 14.81 | 1.36

T cell percentages

Celltype | FL1018T1 (%) | FL1018T2 (%)
Cytotoxic T cells (activated) | 6.61 | 41.24
Cytotoxic T cells | 12.02 | 9.79
Naive/resting CD4 | 25.05 | 4.64
CD4 (activated) | 14.03 | 25.26
Tfh | 42.28 | 19.07

B cell percentages

Celltype | FL1018T1 (%) | FL1018T2 (%)
B cells | 11.83 | 0.48
B cells (malignant) | 88.17 | 99.52

The nonmalignant fraction of B cells becomes virtually nonexistent in T2.
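The within-B-cell percentages can be approximately recovered by renormalizing the overall proportions over the two B cell categories. The Python sketch below does this; the helper name `renormalize` is hypothetical, and small discrepancies (a few hundredths of a percent) are expected because the inputs are already rounded.

```python
# Overall proportions (%) for the two B cell categories, from the overall proportions table.
overall = {
    "FL1018T1": {"B cells": 7.65, "B cells (malignant)": 56.98},
    "FL1018T2": {"B cells": 0.44, "B cells (malignant)": 92.34},
}

def renormalize(percentages):
    """Rescale a subset of percentages so that they sum to 100."""
    total = sum(percentages.values())
    return {name: 100.0 * value / total for name, value in percentages.items()}

within_b = {sample: renormalize(p) for sample, p in overall.items()}
for sample, p in within_b.items():
    print(sample, {name: round(value, 2) for name, value in p.items()})
```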

Differential expression between malignant and nonmalignant B cells

Introduction

We’ll look at genes that are differentially expressed between malignant and nonmalignant B cells. We treat timepoint as a blocking variable.

In the plots that follow, genes that are upregulated in malignant cells relative to nonmalignant cells have positive logFC values.

Pathways

Genes

Some key genes for which we see consistent observations are BCL2, BCL6, and ID-2.

Refer to http://www.bloodjournal.org/content/99/1/282 for a more comprehensive list of markers to check.

Table of significant genes

Key points

  • Upregulation of BCL2 and BCL6 in malignant cells
  • Upregulation of ID-2 in malignant cells

Differential expression within celltypes

Introduction

For each celltype, we ask what genes and pathways are differentially expressed between timepoints. In the plots that follow, genes that are upregulated in T2 relative to T1 will have positive logFC values.

CD8 T cells (pathways)

CD8 T cells (genes)

Upregulation of CD8 T effector activity in T2?

CD8 T cells (table of significant entries)

Malignant B cells (pathways)

Malignant B cells (genes)

Downregulation of antigen presentation, perhaps due to the corresponding upregulation of the T effector response in T2?

Malignant B cells (table of significant entries)

CD4 T cells (pathways)

Nothing significant.

CD4 T cells (genes)

Heat shock proteins HSPA1A, HSPA1B, and HSPH1 appear to be downregulated in T2. This may be a processing artifact; we can check whether it is consistent across other celltypes. CD69, by contrast, is upregulated in T2 relative to T1, suggesting activation of CD4 T cells in T2 as well. We still need to think about the CXCR4-expressing population: are those Tfh cells or just memory T cells? As far as I'm aware, this is not well described in the literature.

CD4 T cells (table of significant entries)

T follicular helper cells (pathways)

Nothing significant.

T follicular helper cells (genes)

Again, we see the heat shock response downregulated in T2 relative to T1. JUN, however, is upregulated in T2, which could be biological or batch-associated. As in CD4 T cells, CD69 is upregulated in T2.

T follicular helper cells (table of significant entries)

Nonmalignant B cells (pathways)

Nothing significant. However, do notice that the number of nonmalignant B cells has dropped quite precipitously, especially in relation to the discordant rise in malignant B cell prevalence.

Nonmalignant B cells (genes)

No HLA loss or anything of much interest here, I think. TCL1A and FCER2 might be interesting though, with more FL knowledge.

Nonmalignant B cells (table of significant entries)

Cycling cells

Counts (by celltype)

Celltype | G1 | G2M | S
B cells | 77 | 6 | 38
B cells (malignant) | 1844 | 148 | 1327
CD4 (activated) | 90 | 10 | 19
Cytotoxic T cells | 56 | 4 | 19
Cytotoxic T cells (activated) | 77 | 9 | 27
Naive/resting CD4 | 87 | 9 | 38
other | 6 | 0 | 1
Tfh | 155 | 23 | 70

Percentages (by celltype)

Celltype | G1 (%) | G2M (%) | S (%)
B cells | 63.64 | 4.96 | 31.40
B cells (malignant) | 55.56 | 4.46 | 39.98
CD4 (activated) | 75.63 | 8.40 | 15.97
Cytotoxic T cells | 70.89 | 5.06 | 24.05
Cytotoxic T cells (activated) | 68.14 | 7.96 | 23.89
Naive/resting CD4 | 64.93 | 6.72 | 28.36
other | 85.71 | 0.00 | 14.29
Tfh | 62.50 | 9.27 | 28.23

As expected (perhaps), cancer cells have the highest percentage of cycling cells. Overall though, we may be overestimating the percentage of cycling cells – check and see what is going on with cyclone here.
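The percentages follow directly from the counts by celltype. The Python sketch below recomputes them and ranks celltypes by cycling fraction; treating "cycling" as G2M plus S is an assumption about how the term is used here, and the function name is hypothetical.

```python
# Counts by celltype from the table above: (G1, G2M, S).
counts = {
    "B cells": (77, 6, 38),
    "B cells (malignant)": (1844, 148, 1327),
    "CD4 (activated)": (90, 10, 19),
    "Cytotoxic T cells": (56, 4, 19),
    "Cytotoxic T cells (activated)": (77, 9, 27),
    "Naive/resting CD4": (87, 9, 38),
    "other": (6, 0, 1),
    "Tfh": (155, 23, 70),
}

def phase_percentages(g1, g2m, s):
    """Percentage of cells in each phase, rounded to two decimals."""
    total = g1 + g2m + s
    return tuple(round(100.0 * n / total, 2) for n in (g1, g2m, s))

percentages = {ct: phase_percentages(*c) for ct, c in counts.items()}

# Assumption: "cycling" means G2M + S, i.e. everything not assigned to G1.
cycling_pct = {ct: round(100.0 * (g2m + s) / (g1 + g2m + s), 2)
               for ct, (g1, g2m, s) in counts.items()}
most_cycling = max(cycling_pct, key=cycling_pct.get)
print(most_cycling, cycling_pct[most_cycling])
```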

Counts (by sample)

Sample | G1 | G2M | S
FL1018T1 | 928 | 80 | 417
FL1018T2 | 1464 | 129 | 1122

Percentages (by sample)

Sample | G1 (%) | G2M (%) | S (%)
FL1018T1 | 65.12 | 5.61 | 29.26
FL1018T2 | 53.92 | 4.75 | 41.33

There’s a higher percentage of cycling cells in T2 – consistent with disease transformation and worsening patient prognosis.
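One way to quantify this difference is an unadjusted odds ratio of cycling in T2 versus T1, computed from the counts by sample. The Python sketch below assumes that "cycling" means G2M plus S; it ignores celltype composition, so it is only a crude summary.

```python
import math

# Counts by sample from the table above: (G1, G2M, S).
counts = {"FL1018T1": (928, 80, 417), "FL1018T2": (1464, 129, 1122)}

def cycling_odds(g1, g2m, s):
    """Odds of a cell being cycling (assumed to mean G2M or S) versus G1."""
    return (g2m + s) / g1

odds_ratio = cycling_odds(*counts["FL1018T2"]) / cycling_odds(*counts["FL1018T1"])
log_odds_ratio = math.log(odds_ratio)
print(round(odds_ratio, 2), round(log_odds_ratio, 2))
```

The odds ratio comes out at roughly 1.6, i.e. the odds of a cell being assigned to a cycling phase are about 60% higher in T2 before any adjustment for celltype.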

Generalized linear model

We can go deeper into investigating this and see if ALL cells in T2 have higher cycling percentages, or only cancer cells.

Fixed effects: timepoint, celltype

Interaction terms: none (a timepoint*celltype interaction term would overparametrize the model)

Response family: binomial
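The overparametrization point can be checked by counting parameters. The sketch below assumes the summary table has one row per timepoint and celltype combination (2 x 8 = 16 rows), which is consistent with the degrees of freedom reported in the model summary; the function name is hypothetical.

```python
N_TIMEPOINTS = 2   # FL1018T1, FL1018T2
N_CELLTYPES = 8    # celltypes listed in the cycling tables
n_observations = N_TIMEPOINTS * N_CELLTYPES  # assumed: one row per combination

def n_parameters(n_timepoints, n_celltypes, interaction):
    """Number of coefficients with treatment-coded factors."""
    main_effects = 1 + (n_timepoints - 1) + (n_celltypes - 1)  # intercept + main effects
    interaction_terms = (n_timepoints - 1) * (n_celltypes - 1) if interaction else 0
    return main_effects + interaction_terms

residual_df_main = n_observations - n_parameters(N_TIMEPOINTS, N_CELLTYPES, interaction=False)
residual_df_full = n_observations - n_parameters(N_TIMEPOINTS, N_CELLTYPES, interaction=True)
print(residual_df_main, residual_df_full)
```

With the interaction included, the model has as many coefficients as observations, so it would fit the data exactly and leave no residual degrees of freedom.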

Residuals plot

Normality testing of the residuals is unnecessary, as a binomial model does not assume normality (and such tests are generally uninformative to begin with).

Model summary

## 
## Call:
## glm(formula = cbind(cycling_count, total_count - cycling_count) ~ 
##     dataset + celltype_full, family = "binomial", data = cycling_summary)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.67774  -1.02843   0.00414   0.60776   2.44142  
## 
## Coefficients:
##                                            Estimate Std. Error z value
## (Intercept)                                -0.60241    0.18952  -3.179
## datasetFL1018T2                             0.41404    0.07648   5.414
## celltype_fullB cells (malignant)            0.06410    0.19909   0.322
## celltype_fullCD4 (activated)               -0.71119    0.28731  -2.475
## celltype_fullCytotoxic T cells             -0.39328    0.31269  -1.258
## celltype_fullCytotoxic T cells (activated) -0.45773    0.28154  -1.626
## celltype_fullNaive/resting CD4             -0.04250    0.26214  -0.162
## celltype_fullother                         -1.32038    1.09929  -1.201
## celltype_fullTfh                            0.02755    0.23057   0.119
##                                            Pr(>|z|)    
## (Intercept)                                 0.00148 ** 
## datasetFL1018T2                            6.17e-08 ***
## celltype_fullB cells (malignant)            0.74749    
## celltype_fullCD4 (activated)                0.01331 *  
## celltype_fullCytotoxic T cells              0.20849    
## celltype_fullCytotoxic T cells (activated)  0.10399    
## celltype_fullNaive/resting CD4              0.87120    
## celltype_fullother                          0.22970    
## celltype_fullTfh                            0.90491    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 91.133  on 15  degrees of freedom
## Residual deviance: 17.761  on  7  degrees of freedom
## AIC: 102.7
## 
## Number of Fisher Scoring iterations: 4

The coefficients point in the expected direction: the T2 coefficient is positive, i.e. the odds of cycling are higher in T2. It is not exactly surprising that most of the celltype coefficients are nonsignificant; after all, we have very few data points.
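To read the coefficients on a more familiar scale, they can be converted from log-odds to odds ratios and probabilities. The Python sketch below uses estimates copied from the model summary above; it reads the reference levels as FL1018T1 and nonmalignant B cells, since those are the levels without a coefficient in the output.

```python
import math

# Estimates copied from the model summary (log-odds scale).
coef = {
    "(Intercept)": -0.60241,
    "datasetFL1018T2": 0.41404,
    "celltype_fullB cells (malignant)": 0.06410,
}

def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Odds ratio for cycling in T2 vs. T1, adjusted for celltype.
odds_ratio_t2 = math.exp(coef["datasetFL1018T2"])

# Fitted cycling probability at the reference levels (nonmalignant B cells, T1).
p_reference = inv_logit(coef["(Intercept)"])

# Fitted cycling probability for malignant B cells in T2.
p_malignant_t2 = inv_logit(sum(coef.values()))

print(round(odds_ratio_t2, 2), round(p_reference, 3), round(p_malignant_t2, 3))
```

The T2 odds ratio is about 1.5, and the fitted cycling probability for malignant B cells in T2 is about 0.47.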

An observation we may want to note is how the percentage of cycling malignant B cells vs. other celltypes changes from T1 to T2.

T1

Celltype | G1 (%) | G2M (%) | S (%)
B cells | 65.14 | 2.75 | 32.11
B cells (malignant) | 64.90 | 4.93 | 30.17
CD4 (activated) | 74.29 | 7.14 | 18.57
Cytotoxic T cells | 71.67 | 3.33 | 25.00
Cytotoxic T cells (activated) | 54.55 | 9.09 | 36.36
Naive/resting CD4 | 65.60 | 6.40 | 28.00
other | 100.00 | 0.00 | 0.00
Tfh | 61.61 | 9.00 | 29.38

T2

Celltype | G1 (%) | G2M (%) | S (%)
B cells | 50.00 | 25.00 | 25.00
B cells (malignant) | 52.53 | 4.31 | 43.16
CD4 (activated) | 77.55 | 10.20 | 12.24
Cytotoxic T cells | 68.42 | 10.53 | 21.05
Cytotoxic T cells (activated) | 73.75 | 7.50 | 18.75
Naive/resting CD4 | 55.56 | 11.11 | 33.33
other | 50.00 | 0.00 | 50.00
Tfh | 67.57 | 10.81 | 21.62

So the only celltype for which we have a decent number of cells (see next slide) and a substantial increase in cycling cells is the malignant B cell population. Indeed, the malignant B cells are cycling more actively in T2 while the other populations are not.

T2 cell counts

celltype_full | count
B cells | 12
B cells (malignant) | 2507
CD4 (activated) | 49
Cytotoxic T cells | 19
Cytotoxic T cells (activated) | 80
Naive/resting CD4 | 9
other | 2
Tfh | 37
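Combining the T1 and T2 percentage tables with these counts, the sketch below lists the celltypes whose cycling percentage rises from T1 to T2 among those with a reasonable number of T2 cells. The cutoff of 30 cells is a hypothetical choice made for illustration, and "cycling" is again assumed to mean everything not in G1.

```python
# G1 percentages per celltype: (T1, T2), from the two tables above.
g1_pct = {
    "B cells": (65.14, 50.00),
    "B cells (malignant)": (64.90, 52.53),
    "CD4 (activated)": (74.29, 77.55),
    "Cytotoxic T cells": (71.67, 68.42),
    "Cytotoxic T cells (activated)": (54.55, 73.75),
    "Naive/resting CD4": (65.60, 55.56),
    "other": (100.00, 50.00),
    "Tfh": (61.61, 67.57),
}
t2_counts = {
    "B cells": 12, "B cells (malignant)": 2507, "CD4 (activated)": 49,
    "Cytotoxic T cells": 19, "Cytotoxic T cells (activated)": 80,
    "Naive/resting CD4": 9, "other": 2, "Tfh": 37,
}
MIN_T2_CELLS = 30  # hypothetical cutoff for "a decent number of cells"

def cycling_change(t1_g1, t2_g1):
    """Change in cycling percentage (100 - G1) from T1 to T2, in percentage points."""
    return round(t1_g1 - t2_g1, 2)

increasing = [
    ct for ct, (t1, t2) in g1_pct.items()
    if t2_counts[ct] >= MIN_T2_CELLS and cycling_change(t1, t2) > 0
]
print(increasing)
```

Under this cutoff, only the malignant B cells show an increase (about 12 percentage points); the other well-sampled celltypes show a decrease.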

Plot by celltype